NSF PAR Search | NSF Public Access Repository

Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning

Tang, Yuxin; Ding, Zhimin; Jankov, Dimitrije; Yuan, Binhang; Bourgeois, Daniel; Jermaine, Chris (July 2023, Proceedings of Machine Learning Research (ICML))

The relational data model was designed to facilitate large-scale data management and analytics. We consider the problem of how to differentiate computations expressed relationally. We show experimentally that a relational engine running an auto-differentiated relational algorithm can easily scale to very large datasets, and is competitive with state-of-the-art, special-purpose systems for large-scale distributed machine learning.
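As a rough illustration of what differentiating a relational computation can look like, the sketch below stores matrices as relations keyed by index pairs, computes a product with a join and a sum, and then computes the gradient with respect to one input using a second join and sum. This is a minimal sketch based on the standard matrix-calculus identity dL/dA[i,k] = sum over j of dL/dC[i,j] * B[k,j]; the function names and the dictionary layout are assumptions made for the example and are not taken from the paper.

```python
from collections import defaultdict

def matmul_rel(a, b):
    """C = A @ B over relations {(row, col): value}: join on k, sum per (i, j)."""
    out = defaultdict(float)
    for (i, k), a_val in a.items():
        for (k2, j), b_val in b.items():
            if k == k2:                      # join predicate on the shared index
                out[(i, j)] += a_val * b_val  # aggregation: SUM grouped by (i, j)
    return dict(out)

def grad_wrt_a(grad_c, b):
    """dL/dA as another join plus aggregation: join on j, sum per (i, k)."""
    out = defaultdict(float)
    for (i, j), g_val in grad_c.items():
        for (k, j2), b_val in b.items():
            if j == j2:
                out[(i, k)] += g_val * b_val
    return dict(out)

# Hypothetical inputs: A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]]
A = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}
B = {(0, 0): 5.0, (0, 1): 6.0, (1, 0): 7.0, (1, 1): 8.0}

C = matmul_rel(A, B)
# Loss = sum of all entries of C, so dL/dC is 1 for every output key.
grad_C = {key: 1.0 for key in C}
grad_A = grad_wrt_a(grad_C, B)
print(grad_A)
```

The point of the sketch is only that the backward pass has the same join-then-aggregate shape as the forward pass, so the same relational engine could in principle run both.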
Full Text Available
Serving Deep Learning Models from Relational Databases

https://doi.org/10.48786/edbt.2024.61

Zhou, Lixi; Lin, Qi; Chowdhury, Kanchan; Masood, Saif; Eichenberger, Alexandre; Min, Hong; Sim, Alexander; Wang, Jie; Wang, Yida; Wu, Kesheng; et al (January 2024, OpenProceedings.org)

Serving deep learning (DL) models on relational data has become a critical requirement across diverse commercial and scientific domains, sparking growing interest recently. In this visionary paper, we embark on a comprehensive exploration of representative architectures to address the requirement. We highlight three pivotal paradigms: The state-of-the-art DL-centric architecture offloads DL computations to dedicated DL frameworks. The potential UDF-centric architecture encapsulates one or more tensor computations into User Defined Functions (UDFs) within the relational database management system (RDBMS). The potential relation-centric architecture aims to represent a large-scale tensor computation through relational operators. While each of these architectures demonstrates promise in specific use scenarios, we identify urgent requirements for seamless integration of these architectures and the middle ground between them. We delve into the gaps that impede the integration and explore innovative strategies to close them. We present a pathway to establish a novel RDBMS for enabling a broad class of data-intensive DL inference applications.
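To make the UDF-centric idea concrete, the sketch below registers a small Python function as a user-defined function and calls it from SQL. It is only an illustration: SQLite (via Python's standard `sqlite3` module), the `samples` table, and the fixed linear classifier `predict` are all assumptions chosen for the example, not systems or models described in the paper.

```python
import sqlite3

def predict(x1, x2):
    """Hypothetical stand-in for a model: a fixed linear classifier."""
    score = 0.5 * x1 - 0.25 * x2
    return 1 if score > 0 else 0

conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL as a two-argument scalar UDF.
conn.create_function("predict", 2, predict)

conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, x1 REAL, x2 REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [(1, 2.0, 1.0), (2, 1.0, 4.0)],
)

# Inference runs inside the query, one call per row.
rows = conn.execute(
    "SELECT id, predict(x1, x2) FROM samples ORDER BY id"
).fetchall()
print(rows)  # [(1, 1), (2, 0)]
conn.close()
```

In this arrangement the data never leaves the database process, which is the property the UDF-centric description emphasizes; a DL-centric design would instead export the rows to a separate framework.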
Distributed learning of fully connected neural networks using independent subnet training

https://doi.org/10.14778/3529337.3529343

Yuan, Binhang; Wolfe, Cameron R.; Dun, Chen; Tang, Yuxin; Kyrillidis, Anastasios; Jermaine, Chris (April 2022, Proceedings of the VLDB Endowment)

Full Text Available
Automatic Optimization of Matrix Implementations for Distributed Machine Learning and Linear Algebra

https://doi.org/10.1145/3448016.3457317

Luo, Shangyu; Jankov, Dimitrije; Yuan, Binhang; Jermaine, Chris (June 2021, ACM SIGMOD Conference 2021)
Full Text Available
Distributed numerical and machine learning computations via two-phase execution of aggregated join trees

https://doi.org/10.14778/3450980.3450991

Jankov, Dimitrije; Yuan, Binhang; Luo, Shangyu; Jermaine, Chris (March 2021, Proceedings of the VLDB Endowment)
When numerical and machine learning (ML) computations are expressed relationally, classical query execution strategies (hash-based joins and aggregations) can do a poor job distributing the computation. In this paper, we propose a two-phase execution strategy for numerical computations that are expressed relationally, as aggregated join trees (that is, expressed as a series of relational joins followed by an aggregation). In a pilot run, lineage information is collected; this lineage is used to optimally plan the computation at the level of individual records. Then, the computation is actually executed. We show experimentally that a relational system making use of this two-phase strategy can be an excellent platform for distributed ML computations.
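The phrase "a series of relational joins followed by an aggregation" can be illustrated with matrix multiplication, which has exactly that shape when matrices are stored as relations of (row, column, value) tuples. The sketch below is a minimal single-process example; the function names and tuple layout are assumptions for illustration, and it does not implement the pilot run, lineage collection, or record-level planning described in the abstract.

```python
from collections import defaultdict

def join_on_k(a_rows, b_rows):
    """Hash join of A(i, k, val) with B(k, j, val) on the shared index k."""
    b_by_k = defaultdict(list)
    for k, j, b_val in b_rows:
        b_by_k[k].append((j, b_val))
    for i, k, a_val in a_rows:
        for j, b_val in b_by_k.get(k, []):
            # Each joined pair contributes one partial product to C[i, j].
            yield (i, j, a_val * b_val)

def aggregate_sum(joined_rows):
    """GROUP BY (i, j) with SUM over the partial products."""
    out = defaultdict(float)
    for i, j, product in joined_rows:
        out[(i, j)] += product
    return dict(out)

def relational_matmul(a_rows, b_rows):
    """Join followed by aggregation: the one-join case of an aggregated join tree."""
    return aggregate_sum(join_on_k(a_rows, b_rows))

# Hypothetical inputs: A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]]
A = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)]
B = [(0, 0, 5.0), (0, 1, 6.0), (1, 0, 7.0), (1, 1, 8.0)]

C = relational_matmul(A, B)
print(C)  # {(0, 0): 19.0, (0, 1): 22.0, (1, 0): 43.0, (1, 1): 50.0}
```

A chain of matrix products would add further joins before the final aggregation, which is where the choice of how to distribute records across workers starts to matter.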
Full Text Available
Tensor relational algebra for distributed machine learning system design

https://doi.org/10.14778/3457390.3457399

Yuan, Binhang; Jankov, Dimitrije; Zou, Jia; Tang, Yuxin; Bourgeois, Daniel; Jermaine, Chris (April 2021, Proceedings of the VLDB Endowment)

We consider the question: what is the abstraction that should be implemented by the computational engine of a machine learning system? Current machine learning systems typically push whole tensors through a series of compute kernels such as matrix multiplications or activation functions, where each kernel runs on an AI accelerator (ASIC) such as a GPU. This implementation abstraction provides little built-in support for ML systems to scale past a single machine, or for handling large models with matrices or tensors that do not easily fit into the RAM of an ASIC. In this paper, we present an alternative implementation abstraction called the tensor relational algebra (TRA). The TRA is a set-based algebra based on the relational algebra. Expressions in the TRA operate over binary tensor relations, where keys are multi-dimensional arrays and values are tensors. The TRA is easily executed with high efficiency in a parallel or distributed environment, and amenable to automatic optimization. Our empirical study shows that the optimized TRA-based back-end can significantly outperform alternatives for running ML workflows in distributed clusters.
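The notion of a tensor relation, where keys are multi-dimensional indices and values are tensors, can be sketched with a blocked matrix multiplication. In the example below each relation is a dictionary from a block index to a small dense block stored as nested lists; the join pairs blocks that share the inner index, a kernel multiplies each pair, and the aggregation sums blocks per output key. All names here are hypothetical, and the code is a plain-Python illustration of the idea rather than the TRA operators or optimizer from the paper.

```python
from collections import defaultdict

def block_matmul(x, y):
    """Kernel: dense product of two small blocks stored as nested lists."""
    return [[sum(x[r][t] * y[t][c] for t in range(len(y)))
             for c in range(len(y[0]))]
            for r in range(len(x))]

def block_add(x, y):
    """Aggregation function: elementwise sum of two equally shaped blocks."""
    return [[a + b for a, b in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]

def tensor_relation_matmul(rel_a, rel_b):
    """Join blocks on the shared inner index, multiply, then sum per (i, j)."""
    partial = defaultdict(list)
    for (i, k), a_blk in rel_a.items():
        for (k2, j), b_blk in rel_b.items():
            if k == k2:
                partial[(i, j)].append(block_matmul(a_blk, b_blk))
    result = {}
    for key, blocks in partial.items():
        total = blocks[0]
        for blk in blocks[1:]:
            total = block_add(total, blk)
        result[key] = total
    return result

# Hypothetical data: a 2x4 matrix A split into two 2x2 column blocks, and a
# 4x2 matrix B (two stacked 2x2 identities) split into two 2x2 row blocks.
rel_A = {(0, 0): [[1, 2], [5, 6]], (0, 1): [[3, 4], [7, 8]]}
rel_B = {(0, 0): [[1, 0], [0, 1]], (1, 0): [[1, 0], [0, 1]]}

rel_C = tensor_relation_matmul(rel_A, rel_B)
print(rel_C)  # {(0, 0): [[4, 6], [12, 14]]}
```

Because each key-value pair is an independent unit of work, the pairs could be partitioned across machines by key, which is the scaling property the abstract attributes to the set-based formulation.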
Full Text Available
